R Basics

Rüçhan Ekren
Jean Monlong
Margot Zahm

Why R?

Simple

  • Interpreted language (no compilation needed)
  • No manual memory management
  • Vectorized

Free

  • Widely used, vast community of R users
  • Good life expectancy

Why R?

Flexible

  • Open-source: anyone can see/create/modify
  • Multiplatform: runs on Windows, Mac and Unix

Trendy

  • More and more packages
  • More and more popular among data scientists and (now) biologists

Lots of bioinfo packages

Workshop Setup

  • Open RStudio

Workshop Setup

Open a new R script file (File > New File > R Script)

Workshop Setup

Console

  • Where R is running
  • You can write and run the commands directly here
  • Your command executes when you press Enter

Workshop Setup

Script

  • A text file of commands, with the extension .R
  • Keeps a record of your analysis
  • Highly recommended
  • Send commands from the script to the console with the Run button

Workshop Setup

Tracking panel

  • Lists all the variables you created
  • Keeps a history of the commands you ran

Workshop Setup

Multipurpose panel

Browse files on your computer, view plots, manage packages, and read the help page of a function.

Workshop Setup

Caution

Write everything you do in scripts to avoid losing your work.

When you get an error

  1. Read the command, look for typos
  2. Read the error message
     Repeat steps 1 and 2
  3. Raise your hand, someone will assist you

Tip

Solving errors is an important skill to learn.

Objects

Objects - Overview

Unit type

  • numeric: numbers
  • logical: binary, two possible values (TRUE or FALSE)
  • character: text between quotes (")
  • comment: line starting with #
# This is a comment line
# I can write anything I want
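
As a small illustration of the unit types above, the class() function reports the type of a value. The values used here are arbitrary examples, not part of the original slides.

```r
# class() returns the type of a value
class(42.5)     # "numeric"
class(TRUE)     # "logical"
class("BRCA1")  # "character"

# A comment line is ignored by R:
# class("this is never run")
```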

Tip

Comment your script to help you remember what you have done.

Objects - Overview

Complex type

  • vector: Ordered collection of elements of the same type
  • list: Flexible container, mixed types possible. Recursive (a list can contain other lists)

Objects - Overview

Complex type

  • matrix: Table of elements of the same type
  • data.frame: Table of mixed type elements

Note

These are the basic complex types. Many other complex objects exist that combine these basic ones.
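
A minimal sketch of the four basic complex types. All object names and values are hypothetical, chosen only for illustration; `<-` assigns a value to a name and c() builds a vector.

```r
# vector: ordered elements of the same type
gene_lengths <- c(1200, 850, 2300)

# list: elements of mixed types, may contain other lists
sample_info <- list(id = "S1", age = 34, tags = list("control", "batch1"))

# matrix: table of elements of the same type
count_matrix <- matrix(1:6, nrow = 2, ncol = 3)

# data.frame: table whose columns can have different types
gene_table <- data.frame(gene = c("A", "B", "C"), length = gene_lengths)

class(gene_lengths)  # "numeric"
class(sample_info)   # "list"
dim(count_matrix)    # 2 3 (rows, columns)
class(gene_table)    # "data.frame"
```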

Objects - Naming conventions

  • Use letters, numbers, dots or underscores
  • Start with a letter (or a dot not followed by a number)
  • Some names are forbidden (e.g. if, else, TRUE, FALSE)
  • Correct: valid.name, valid_name, valid2name3
  • Incorrect: valid name, valid-name, 1valid2name3

Tip

Avoid uninformative names such as var1, var2. Use meaningful names: gene_list, nb_elements

Objects - Assign a value

Write the name of the object, followed by the assignment symbol (<-) and the value.
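
For instance, assuming the example names from the tip above and arbitrary values:

```r
# name <- value
nb_elements <- 7
gene_list <- "TP53"

# Typing the name of an object prints its value
nb_elements  # 7
gene_list    # "TP53"
```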

Objects - Arithmetic operators

You can apply operators to objects to compute new values. Depending on the object type, operators behave differently and some are not allowed.

  • addition: +
  • subtraction: -
  • multiplication: *
  • division: /
  • exponent: ^ or **
  • integer division: %/%
  • modulo: %%
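
A quick sketch of each operator applied to a numeric object. The starting value 7 is arbitrary.

```r
x <- 7

x + 2    # 9
x - 2    # 5
x * 2    # 14
x / 2    # 3.5
x ^ 2    # 49
x ** 2   # 49 (same as ^)
x %/% 2  # 3 (integer division)
x %% 2   # 1 (remainder)
```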

Objects - Arithmetic operators

Exercise

  1. Create a numeric object
  2. Multiply it by 6
  3. Add 21
  4. Divide it by 3
  5. Subtract 1
  6. Halve it
  7. Subtract its original value

Objects - Arithmetic operators

Solution
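
One possible solution, as a sketch. The object names and the starting value 7 are arbitrary choices, not taken from the original slides.

```r
original <- 7                # 1. create a numeric object

result <- original * 6       # 2. multiply by 6      -> 42
result <- result + 21        # 3. add 21             -> 63
result <- result / 3         # 4. divide by 3        -> 21
result <- result - 1         # 5. subtract 1         -> 20
result <- result / 2         # 6. halve it           -> 10
result <- result - original  # 7. subtract original  -> 3

result  # 3
```

The result is 3 whatever the starting value, because ((6x + 21) / 3 - 1) / 2 - x simplifies to 3.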

Objects - Arithmetic operators

Exercise

Try to raise errors using operators.
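
As one example of what this can look like, adding a number to a character value raises an error. The sketch below wraps the call in tryCatch(), which is not covered in this workshop and is used here only so that the script keeps running and the message can be inspected.

```r
# "a" + 1 is not allowed: arithmetic needs numeric values.
# tryCatch() captures the error instead of stopping the script.
msg <- tryCatch(
  "a" + 1,
  error = function(e) conditionMessage(e)
)
msg  # the text of the error message

# Not every surprising operation is an error:
1 / 0  # Inf
```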

Objects - Function

  • A function is a tool to create or modify an object
  • Format: function_name(object, parameter1 = ..., parameter2 = ...)
  • Read the help page to learn more about a function (help(), ? or F1)

Note

Some functions are in the default installation of R. Other functions come from packages. You can also create your own functions.
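
A short sketch of calling a built-in function with and without a named parameter, then defining a custom one. The function name add_one and the values are hypothetical, chosen for illustration.

```r
x <- 3.14159

# function_name(object, parameter = ...)
round(x)              # 3
round(x, digits = 2)  # 3.14

# To open the help page, run one of these in the console:
# ?round
# help(round)

# A user-defined function (add_one is a made-up name)
add_one <- function(value) {
  value + 1
}
add_one(41)  # 42
```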

Vectors

vector construction

  • c() Concatenate function
  • 1:10 Vector with numbers from 1 to 10

Extra

  • seq Create a sequence of numbers
  • rep Repeat elements several times
  • runif Draw random numbers from a uniform distribution. rnorm and rpois do the same for the normal and Poisson distributions
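
The construction functions above can be sketched as follows. Object names and values are arbitrary; the random draws differ at each run, so no output is shown for them.

```r
numbers <- c(2, 4, 8)              # 2 4 8
letters_ab <- c("a", "b")          # "a" "b"
one_to_ten <- 1:10                 # 1 2 3 ... 10

steps <- seq(0, 1, by = 0.25)      # 0.00 0.25 0.50 0.75 1.00
repeated <- rep("A", times = 3)    # "A" "A" "A"

uniform_values <- runif(5)             # 5 random values between 0 and 1
normal_values <- rnorm(5)              # 5 values from a standard normal
poisson_values <- rpois(5, lambda = 2) # 5 counts from a Poisson with mean 2
```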

Exercise - Create some vectors

Instructions

  • Create a vector with 7 numeric values
  • Create a vector with 7 character values

Vectors

Manipulation

Select elements by their index (position) between []

Characterization

  • length() Number of elements in the vector
  • names() Get or set the names of the vector’s values
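
A minimal sketch of indexing and characterization, using a hypothetical vector of gene lengths:

```r
gene_lengths <- c(1200, 850, 2300, 400)

gene_lengths[1]        # 1200 (positions start at 1)
gene_lengths[2:3]      # 850 2300
gene_lengths[c(1, 4)]  # 1200 400

length(gene_lengths)   # 4

# Set names, then use them to select values
names(gene_lengths) <- c("geneA", "geneB", "geneC", "geneD")
names(gene_lengths)    # "geneA" "geneB" "geneC" "geneD"
gene_lengths["geneC"]  # 2300, printed with its name
```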

Vectors

Manipulation

  • sort() Sort a vector
  • sample() Shuffle a vector
  • rev() Reverse a vector

Extra

  • sort()/sample() Explore extra parameters
  • order() Get the index of the sorted elements
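
These functions can be tried on a small vector. The values are arbitrary; sample() returns a different order at each run, so its output is not shown.

```r
values <- c(3, 1, 2)

sort(values)                     # 1 2 3
sort(values, decreasing = TRUE)  # 3 2 1
rev(values)                      # 2 1 3

# order() gives the positions that would sort the vector
order(values)                    # 2 3 1
values[order(values)]            # 1 2 3

shuffled <- sample(values)       # same values, random order
picked <- sample(values, size = 2)  # 2 values drawn at random
```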

Vectors

Exploration

  • head()/tail() Print the first/last values
  • summary() Summary statistics
  • min()/max()/mean()/median()/var() Minimum, maximum, average, median, variance
  • sum Sum of the vector’s values

Extra

  • log/log2/log10 Logarithm functions
  • sqrt Square-root function
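
A brief sketch of the exploration functions on an arbitrary numeric vector:

```r
values <- c(4, 8, 15, 16, 23, 42)

head(values, 2)  # 4 8
tail(values, 2)  # 23 42

min(values)      # 4
max(values)      # 42
mean(values)     # 18
median(values)   # 15.5
sum(values)      # 108
var(values)      # sample variance
summary(values)  # minimum, quartiles, median, mean, maximum

log2(8)      # 3
log10(1000)  # 3
log(1)       # 0 (natural logarithm)
sqrt(16)     # 4
```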

Vectors

Arithmetic operators

  • Simple arithmetic operations over all the values of the vector
  • Or element by element when using vectors of the same length
  • Arithmetic operations: +, -, *, /
  • Others exist, but let’s set them aside for now
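
Both behaviors can be seen with two short vectors. The names a and b and their values are arbitrary.

```r
a <- c(1, 2, 3)
b <- c(10, 20, 30)

# One value applied to every element
a * 2   # 2 4 6
a + 10  # 11 12 13

# Element by element, for vectors of the same length
a + b   # 11 22 33
b / a   # 10 10 10
```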

Exercise

Instructions

  1. Create a vector with 100 random numeric values (hint: runif or rnorm)
  2. Subtract the average of those values
  3. Divide by the standard deviation
  4. Multiply all the values by 10
  5. Add 100 to all the values
  6. Compute summary statistics (minimum/maximum, median, mean)
  7. Compute the standard deviation of the new values
  8. Plot a histogram of the new values with the hist function